Fix flat reader subrange decode reuse by lukekim · Pull Request #8596 · vortex-data/vortex

lukekim · 2026-06-25T18:13:20Z

Fixes #8587.

Summary

Memoize FlatReader's decoded array future so synthetic subrange scans share the decoded flat segment instead of issuing repeated decode work.
Add regression coverage for the query patterns that regressed after perf(scan): intra-file decode parallelism — sub-split large chunk spans #8400, including projection-only, filter-only, filtered projection, computed projection, string filtered projection, and string filtered computed projection cases.
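
As a rough illustration of the memoization described above, the sketch below decodes a segment once and lets every subrange read slice the shared result. It is a minimal synchronous stand-in, not the code in this PR: the PR memoizes a decoded array future, whereas this sketch uses `std::sync::OnceLock`, and `MemoizedFlatReader`, `read_range`, and `decode_calls` are hypothetical names that are not part of the Vortex API.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};

/// Hypothetical stand-in for a flat layout reader; not the Vortex `FlatReader` API.
pub struct MemoizedFlatReader {
    /// Encoded segment bytes (illustrative format: little-endian u32 values).
    encoded: Vec<u8>,
    /// Decoded segment, populated at most once and shared by every subrange read.
    decoded: OnceLock<Arc<Vec<u32>>>,
    /// Counts how many times the decode routine actually ran.
    decode_calls: AtomicUsize,
}

impl MemoizedFlatReader {
    pub fn new(encoded: Vec<u8>) -> Self {
        Self {
            encoded,
            decoded: OnceLock::new(),
            decode_calls: AtomicUsize::new(0),
        }
    }

    /// Decode the whole segment on first use; later callers reuse the cached result.
    fn decoded(&self) -> Arc<Vec<u32>> {
        self.decoded
            .get_or_init(|| {
                self.decode_calls.fetch_add(1, Ordering::SeqCst);
                let values: Vec<u32> = self
                    .encoded
                    .chunks_exact(4)
                    .map(|b| u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
                    .collect();
                Arc::new(values)
            })
            .clone()
    }

    /// Read a row subrange by slicing the shared decoded segment.
    pub fn read_range(&self, rows: Range<usize>) -> Vec<u32> {
        self.decoded()[rows].to_vec()
    }

    /// Number of times the segment was actually decoded.
    pub fn decode_calls(&self) -> usize {
        self.decode_calls.load(Ordering::SeqCst)
    }
}
```

With this shape, reading rows `0..4` and then `4..8` from the same reader runs the decode closure only once; without the cache, each subrange read would decode the full segment again.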

Validation

cargo nextest run -p vortex-layout -E 'test(layouts::flat::reader)'
cargo nextest run -p vortex-layout
cargo clippy -p vortex-layout --all-targets --all-features
git diff --check
cargo bench --workspace
Re-ran SQL benches with /opt/homebrew/bin/uv 0.11.24; core suites passed: Appian, TPCH, TPCDS, ClickBench, ClickBench sorted, FineWeb, and GH Archive via direct binary rerun. PolarSignals/StatPopGen exposed pre-existing benchmark-definition/runtime backend failures, and bare Public BI requires --opt dataset=<name>.

Signed-off-by: Luke Kim <80174+lukekim@users.noreply.github.com>

codspeed-hq · 2026-06-26T17:07:49Z

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 5 improved benchmarks
❌ 3 regressed benchmarks
✅ 1581 untouched benchmarks
⏩ 4 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

|  | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | `chunked_bool_canonical_into[(1000, 10)]` | 15.9 µs | 26.7 µs | -40.29% |
| ❌ | Simulation | `chunked_varbinview_into_canonical[(1000, 10)]` | 169.1 µs | 205.8 µs | -17.83% |
| ❌ | Simulation | `slice_empty_vortex` | 310 ns | 368.3 ns | -15.84% |
| ⚡ | Simulation | `bitwise_not_vortex_buffer_mut[128]` | 273.6 ns | 215.3 ns | +27.1% |
| ⚡ | Simulation | `bitwise_not_vortex_buffer_mut[1024]` | 333.9 ns | 275.6 ns | +21.17% |
| ⚡ | Simulation | `bitwise_not_vortex_buffer_mut[2048]` | 427.8 ns | 369.4 ns | +15.79% |
| ⚡ | Simulation | `chunked_varbinview_canonical_into[(100, 100)]` | 259.6 µs | 224.5 µs | +15.65% |
| ⚡ | Simulation | `chunked_varbinview_into_canonical[(100, 100)]` | 306.8 µs | 271.9 µs | +12.84% |
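
The Efficiency column appears to track `BASE / HEAD - 1` expressed as a percentage. That reading is an assumption inferred from the rounded values above, not a definition given in this report, and the helper name `efficiency_percent` is hypothetical. A minimal sketch of the arithmetic:

```rust
/// Relative change, assuming the report's Efficiency column is
/// (BASE / HEAD - 1) * 100. Both timings must use the same unit.
pub fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

Under that assumption, `efficiency_percent(273.6, 215.3)` comes out close to the reported +27.1% for `bitwise_not_vortex_buffer_mut[128]`, and a slower `HEAD` gives a negative value. Small differences from the table are expected because the displayed timings are rounded.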

Tip

Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or directly use the CodSpeed MCP with your agent.

Comparing spiceai:lukim/8587-regression (0b86845) with develop (bdbf6c4)

¹ 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.

Fix flat reader subrange decode reuse

6484492

Signed-off-by: Luke Kim <80174+lukekim@users.noreply.github.com>

lukekim requested a review from a team June 25, 2026 18:13

lukekim mentioned this pull request Jun 25, 2026

perf: Subsplitting large chunks causes some regression for vortex-compact for some benchmarks #8587

Open

Merge branch 'develop' into lukim/8587-regression

0b86845

lukekim wants to merge 2 commits into vortex-data:develop from spiceai:lukim/8587-regression
